Skip to content

refactor(trace/semantics): lift conditional gRPC status fallback into mappings.json#50397

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 15 commits into
mainfrom
feat/ehu/OTEL-2718-semantics-conditional-grpc-status
May 8, 2026
Merged

refactor(trace/semantics): lift conditional gRPC status fallback into mappings.json#50397
gh-worker-dd-mergequeue-cf854d[bot] merged 15 commits into
mainfrom
feat/ehu/OTEL-2718-semantics-conditional-grpc-status

Conversation

@edwardhu-datadog
Copy link
Copy Markdown
Contributor

@edwardhu-datadog edwardhu-datadog commented May 5, 2026

What does this PR do?

Moves the rpc.response.status_code → gRPC special case from transform.go into the semantics registry as when clauses on TagInfo. The transform package now has zero gRPC-specific code — LookupString(reg, spanAccessor, ConceptGRPCStatusCode) does it all.

Motivation

Describe how you validated your changes

go test ./pkg/trace/semantics/... -count=1     # 8 sub-tests in TestGRPCStatusCodeConditionalFallback
go test ./pkg/trace/transform/... -count=1     # no behavioral change
go test ./pkg/trace/otel/stats/... -count=1    # OTLP gRPC fixture still populates rpc.grpc.status_code

CI handles the broader integration suites.

Additional Notes

Known minor divergence from #50272: the merged PR's getOTelGRPCStatusCode walked per-source (span-then-resource, both keys per source); our registry walks per-key (OTelSpanAccessor does span-then-resource per key independently). Both check span and resource — only the axis ordering differs.

I don't see this as an issue because this is how we handle HTTPStatusCodes today

Pre-PR (getOTelGRPCStatusCode):                           
  walk per-source first, picking newer key within a source:
    span: rpc.system.name? rpc.system?  → first hit wins
    res:  rpc.system.name? rpc.system?  → first hit wins (only if span empty)

Post-PR (registry, OTelSpanAccessor):
  walk per-key first, picking span over resource for each key:
    rpc.system.name: span? res?  → first hit wins
    rpc.system:      span? res?  → first hit wins (only if rpc.system.name empty)
File What changed
semantics/semantics.go Condition type + When []Condition on TagInfo
semantics/lookup.go conditionMatches / conditionsMatch helpers; 1-line guard in each Lookup*
semantics/mappings.json 4 conditional rpc.response.status_code rows under rpc.grpc.status_code; rpc.system.name added to rpc.system chain
semantics/lookup_test.go New TestGRPCStatusCodeConditionalFallback (8 sub-tests)
transform/transform.go Delete getOTelGRPCStatusCode; call LookupString directly

Design:

  • Conditions read attributes by exact key — no transitive concept resolution. To gate on a renamed attribute, list one fallback row per condition attribute.
  • The four rpc.response.status_code rows are: (string / int64) × (rpc.system.name=grpc / rpc.system=grpc AND rpc.system.name absent). The absence guard preserves the pre-refactor "new key wins" behavior.
  • Condition schema: {attribute, present, eq}. Conditions in a when array are AND-ed.

version: "1.39.0" provenance: both new attributes are from OTel semconv v1.39.0 (2026-01-12); neither is in v1.24.0. Introducing commits: rpc.response.status_code (cd5b45c6, 2025-12-04) and rpc.system.name (18db3e44, 2025-12-09).

Release note: none added — user-visible behavior matches #50272's. The existing apm-otlp-grpc-status-code-bf3137b34cb82136.yaml reno still applies.

@datadog-datadog-prod-us1-2
Copy link
Copy Markdown
Contributor

datadog-datadog-prod-us1-2 Bot commented May 5, 2026

🎯 Code Coverage (details)
Patch Coverage: 73.68%
Overall Coverage: 50.34% (+0.17%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: fd8b5e5 | Docs | Datadog PR Page | Give us feedback!

@dd-octo-sts
Copy link
Copy Markdown
Contributor

dd-octo-sts Bot commented May 5, 2026

Files inventory check summary

File checks results against ancestor d76c9679:

Results for datadog-agent_7.80.0~devel.git.622.fd8b5e5.pipeline.112189231-1_amd64.deb:

No change detected

@dd-octo-sts
Copy link
Copy Markdown
Contributor

dd-octo-sts Bot commented May 5, 2026

Static quality checks

✅ Please find below the results from static quality gates
Comparison made with ancestor d76c967
📊 Static Quality Gates Dashboard
🔗 SQG Job

Successful checks

Info

Quality gate Change Size (prev → curr → max)
agent_deb_amd64 +12.72 KiB (0.00% increase) 742.518 → 742.530 → 750.310
agent_deb_amd64_fips +8.72 KiB (0.00% increase) 700.545 → 700.553 → 702.690
agent_heroku_amd64 +12.72 KiB (0.00% increase) 309.246 → 309.259 → 313.960
agent_rpm_amd64 +12.72 KiB (0.00% increase) 742.501 → 742.514 → 750.280
agent_rpm_amd64_fips +8.72 KiB (0.00% increase) 700.528 → 700.537 → 702.670
agent_rpm_arm64 +12.72 KiB (0.00% increase) 720.414 → 720.427 → 724.050
agent_rpm_arm64_fips +8.72 KiB (0.00% increase) 681.529 → 681.537 → 684.460
agent_suse_amd64 +12.72 KiB (0.00% increase) 742.501 → 742.514 → 750.280
agent_suse_amd64_fips +8.72 KiB (0.00% increase) 700.528 → 700.537 → 702.670
agent_suse_arm64 +12.72 KiB (0.00% increase) 720.414 → 720.427 → 724.050
agent_suse_arm64_fips +8.72 KiB (0.00% increase) 681.529 → 681.537 → 684.460
docker_agent_amd64 +12.72 KiB (0.00% increase) 802.707 → 802.719 → 805.870
docker_agent_arm64 +12.72 KiB (0.00% increase) 805.451 → 805.463 → 809.730
docker_agent_jmx_amd64 +12.72 KiB (0.00% increase) 993.627 → 993.639 → 996.590
docker_agent_jmx_arm64 +12.72 KiB (0.00% increase) 985.149 → 985.162 → 989.410
docker_cluster_agent_amd64 +8.72 KiB (0.00% increase) 206.484 → 206.493 → 207.600
docker_host_profiler_amd64 +14.67 KiB (0.00% increase) 302.171 → 302.186 → 315.800
docker_host_profiler_arm64 +5.34 KiB (0.00% increase) 313.666 → 313.671 → 327.400
14 successful checks with minimal change (< 2 KiB)
Quality gate Current Size
docker_cluster_agent_arm64 220.509 MiB
docker_cws_instrumentation_amd64 7.142 MiB
docker_cws_instrumentation_arm64 6.689 MiB
docker_dogstatsd_amd64 39.511 MiB
docker_dogstatsd_arm64 37.691 MiB
dogstatsd_deb_amd64 30.169 MiB
dogstatsd_deb_arm64 28.295 MiB
dogstatsd_rpm_amd64 30.169 MiB
dogstatsd_suse_amd64 30.169 MiB
iot_agent_deb_amd64 44.521 MiB
iot_agent_deb_arm64 41.494 MiB
iot_agent_deb_armhf 42.230 MiB
iot_agent_rpm_amd64 44.522 MiB
iot_agent_suse_amd64 44.522 MiB

@cit-pr-commenter-54b7da
Copy link
Copy Markdown

cit-pr-commenter-54b7da Bot commented May 6, 2026

Regression Detector

Regression Detector Results

Metrics dashboard
Target profiles
Run ID: f105c3dc-f02b-495e-a7da-1f5e16bd0d07

Baseline: 9de9473
Comparison: 8cf3b53
Diff

Optimization Goals: ✅ No significant changes detected

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

perf experiment goal Δ mean % Δ mean % CI trials links
docker_containers_cpu % cpu utilization -1.57 [-4.45, +1.30] 1 Logs

Fine details of change detection per experiment

perf experiment goal Δ mean % Δ mean % CI trials links
quality_gate_logs % cpu utilization +1.74 [+0.76, +2.72] 1 Logs bounds checks dashboard
ddot_metrics_sum_cumulative memory utilization +0.36 [+0.20, +0.52] 1 Logs
ddot_metrics_sum_cumulativetodelta_exporter memory utilization +0.32 [+0.09, +0.56] 1 Logs
otlp_ingest_metrics memory utilization +0.31 [+0.16, +0.47] 1 Logs
otlp_ingest_logs memory utilization +0.26 [+0.16, +0.36] 1 Logs
quality_gate_idle_all_features memory utilization +0.16 [+0.12, +0.20] 1 Logs bounds checks dashboard
tcp_syslog_to_blackhole ingress throughput +0.13 [-0.06, +0.31] 1 Logs
file_to_blackhole_1000ms_latency egress throughput +0.03 [-0.42, +0.49] 1 Logs
uds_dogstatsd_to_api_v3 ingress throughput +0.01 [-0.19, +0.21] 1 Logs
tcp_dd_logs_filter_exclude ingress throughput -0.00 [-0.09, +0.09] 1 Logs
file_to_blackhole_0ms_latency egress throughput -0.01 [-0.55, +0.53] 1 Logs
uds_dogstatsd_to_api ingress throughput -0.02 [-0.21, +0.18] 1 Logs
file_to_blackhole_100ms_latency egress throughput -0.03 [-0.17, +0.12] 1 Logs
file_to_blackhole_500ms_latency egress throughput -0.04 [-0.45, +0.36] 1 Logs
ddot_logs memory utilization -0.05 [-0.12, +0.02] 1 Logs
ddot_metrics_sum_delta memory utilization -0.10 [-0.28, +0.09] 1 Logs
uds_dogstatsd_20mb_12k_contexts_20_senders memory utilization -0.20 [-0.24, -0.15] 1 Logs
quality_gate_idle memory utilization -0.20 [-0.25, -0.15] 1 Logs bounds checks dashboard
file_tree memory utilization -0.42 [-0.47, -0.37] 1 Logs
ddot_metrics memory utilization -0.44 [-0.63, -0.24] 1 Logs
docker_containers_memory memory utilization -0.48 [-0.58, -0.38] 1 Logs
quality_gate_metrics_logs memory utilization -0.75 [-1.00, -0.50] 1 Logs bounds checks dashboard
docker_containers_cpu % cpu utilization -1.57 [-4.45, +1.30] 1 Logs

Bounds Checks: ✅ Passed

perf experiment bounds_check_name replicates_passed observed_value links
docker_containers_cpu simple_check_run 10/10 707 ≥ 26
docker_containers_memory memory_usage 10/10 244.71MiB ≤ 370MiB
docker_containers_memory simple_check_run 10/10 723 ≥ 26
file_to_blackhole_0ms_latency memory_usage 10/10 0.16GiB ≤ 1.20GiB
file_to_blackhole_0ms_latency missed_bytes 10/10 0B = 0B
file_to_blackhole_1000ms_latency memory_usage 10/10 0.21GiB ≤ 1.20GiB
file_to_blackhole_1000ms_latency missed_bytes 10/10 0B = 0B
file_to_blackhole_100ms_latency memory_usage 10/10 0.17GiB ≤ 1.20GiB
file_to_blackhole_100ms_latency missed_bytes 10/10 0B = 0B
file_to_blackhole_500ms_latency memory_usage 10/10 0.18GiB ≤ 1.20GiB
file_to_blackhole_500ms_latency missed_bytes 10/10 0B = 0B
quality_gate_idle intake_connections 10/10 3 ≤ 4 bounds checks dashboard
quality_gate_idle memory_usage 10/10 140.89MiB ≤ 147MiB bounds checks dashboard
quality_gate_idle_all_features intake_connections 10/10 3 ≤ 4 bounds checks dashboard
quality_gate_idle_all_features memory_usage 10/10 471.23MiB ≤ 495MiB bounds checks dashboard
quality_gate_logs intake_connections 10/10 4 ≤ 6 bounds checks dashboard
quality_gate_logs memory_usage 10/10 179.15MiB ≤ 195MiB bounds checks dashboard
quality_gate_logs missed_bytes 10/10 0B = 0B bounds checks dashboard
quality_gate_metrics_logs cpu_usage 10/10 348.53 ≤ 2000 bounds checks dashboard
quality_gate_metrics_logs intake_connections 10/10 3 ≤ 6 bounds checks dashboard
quality_gate_metrics_logs memory_usage 10/10 368.07MiB ≤ 430MiB bounds checks dashboard
quality_gate_metrics_logs missed_bytes 10/10 0B = 0B bounds checks dashboard

Explanation

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

Performance changes are noted in the perf column of each table:

  • ✅ = significantly better comparison variant performance
  • ❌ = significantly worse comparison variant performance
  • ➖ = no significant change in performance

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

  1. Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.

  2. Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.

  3. Its configuration does not mark it "erratic".

CI Pass/Fail Decision

Passed. All Quality Gates passed.

  • quality_gate_idle_all_features, bounds check memory_usage: 10/10 replicas passed. Gate passed.
  • quality_gate_idle_all_features, bounds check intake_connections: 10/10 replicas passed. Gate passed.
  • quality_gate_idle, bounds check intake_connections: 10/10 replicas passed. Gate passed.
  • quality_gate_idle, bounds check memory_usage: 10/10 replicas passed. Gate passed.
  • quality_gate_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
  • quality_gate_logs, bounds check missed_bytes: 10/10 replicas passed. Gate passed.
  • quality_gate_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
  • quality_gate_metrics_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
  • quality_gate_metrics_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
  • quality_gate_metrics_logs, bounds check missed_bytes: 10/10 replicas passed. Gate passed.
  • quality_gate_metrics_logs, bounds check cpu_usage: 10/10 replicas passed. Gate passed.

Base automatically changed from feat/ehu/OTEL-2718-otlp-grpc-status-trace-metrics to main May 6, 2026 16:01
edwardhu-datadog and others added 8 commits May 6, 2026 12:04
… mappings.json

Move the rpc.system.name/rpc.system → rpc.response.status_code special case
out of transform.getOTelGRPCStatusCode and into the semantic registry as
`when` clauses on TagInfo. This removes the gRPC-specific helper from the
transform package and lets future conditional fallbacks be expressed
declaratively in mappings.json.

Conditions read attributes by exact key (no transitive concept resolution)
— to gate on a renamed attribute, list one fallback row per condition
attribute. This matches the dd-source trace-semantics precedent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Drop the float64-precision sub-test from TestLookupFloat64; it tests a
  structural property unrelated to conditional fallback that this PR
  doesn't touch.
- Drop the conditional.eq cases; the gRPC test cases already exercise Eq
  accept/reject on a realistic concept.
- Drop the conditional.present/conditional.absent cases; Present is
  unused in mappings.json today and the AND test covers Present:true
  implicitly.
- Rename testBoolPtr to boolPtr (the _test.go suffix already conveys it
  is test-only).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nal check inline

Drop the lookupWithConditions/readTag/readFloat64Tag/readInt64Tag
extraction. The conditional fallback feature only needs a single
`if !conditionsMatch(...) { continue }` line at the top of each loop;
extracting per-type readers and a generic helper was orthogonal
churn that obscured the diff.

Net diff vs main: +37 lines, 0 deletions to existing code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The pre-refactor getOTelGRPCStatusCode used GetOTelAttrFromEitherMap with
keys ("rpc.system.name", "rpc.system"), which returns the first non-empty
key — so when an SDK sets rpc.system.name=jsonrpc and (incorrectly) also
rpc.system=grpc, the new key wins and rpc.response.status_code is rejected.

Without this fix the two independent registry rows would each evaluate
against their own attribute, and the legacy rpc.system=grpc row would
match — accepting a status code that pre-PR rejected.

Gate the legacy rows on rpc.system.name being absent so the new semconv
key remains authoritative when both are present.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- TestConditionalLookup now uses NewEmbeddedRegistry() against the real
  mappings.json instead of a hand-rolled fake. Removes ~50 lines of
  Provider/Type struct-literal boilerplate and the synthetic
  conditional.and concept (the rpc.system+rpc.system.name two-condition
  AND in production gives implicit AND coverage). boolPtr helper is no
  longer needed.
- Collapse the duplicated `c.Present != nil` check at the tail of
  conditionMatches into a single return.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…C keys

rpc.response.status_code and rpc.system.name were both first added to
OTel semantic-conventions in v1.39.0 (Jan 2026), not v1.24.0 (Dec 2023).

Verified against the upstream repo:
  - rpc.response.status_code: commit cd5b45c6 (2025-12-04), .chloggen/rpc-status-code.yaml
    "Deprecate per-rpc status code attributes in favor of rpc.response.status_code"
  - rpc.system.name: commit 18db3e44 (2025-12-09), .chloggen/rpc-system-name.yaml
    "Align RPC conventions with naming guidelines. Renames rpc.system to rpc.system.name."

Both commits land in v1.39.0 (published 2026-01-12). model/registry/rpc.yaml
at v1.24.0 has neither attribute.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull the "no declared predicate" default to the top with a comment, then
check each declared predicate independently. Removes the trailing
`return c.Present != nil || found` boolean fold whose meaning was easy
to mis-read.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The single-letter local was idiomatic but unclear in context. Both
locals now self-document: `value := accessor.GetString(c.Attribute)`
and `found := value != ""`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@edwardhu-datadog edwardhu-datadog force-pushed the feat/ehu/OTEL-2718-semantics-conditional-grpc-status branch from 791811e to f5d39d4 Compare May 6, 2026 16:07
@dd-octo-sts dd-octo-sts Bot added internal Identify a non-fork PR team/agent-apm trace-agent team/opentelemetry OpenTelemetry team labels May 6, 2026
@github-actions github-actions Bot added the medium review PR review might take time label May 6, 2026
edwardhu-datadog and others added 3 commits May 6, 2026 16:38
…odeConditionalFallback

The test only exercises the rpc.grpc.status_code mapping and its four
conditional rows, not the conditional machinery in general. Updated name
matches the actual scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…onMatches

A Condition with neither Present nor Eq set is unused in mappings.json
today, and the special-case "fall back to presence check" branch was
the most confusing line in the function (per reviewer feedback).

Treat an empty Condition as a no-op (always true), consistent with how
an empty `when: []` array also returns true (zero-iteration loop is
vacuously true). Authors who want a presence check write
{"attribute": "X", "present": true} explicitly.

Function shrinks from 14 to 11 lines; doc comment now states the
semantics explicitly without referring to a degenerate path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…predicate

The vast majority of fallback rows in mappings.json have no `when`
clause; calling conditionsMatch on those rows just iterates an empty
slice and returns true. Adding `len(tag.When) > 0 &&` at each call site
makes the conditional structure visible at the lookup site (most rows
unconditional, some conditional) and avoids the dead function call.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@edwardhu-datadog edwardhu-datadog marked this pull request as ready for review May 6, 2026 21:20
@edwardhu-datadog edwardhu-datadog requested review from a team as code owners May 6, 2026 21:20
@edwardhu-datadog edwardhu-datadog requested review from a team, IbraheemA, jade-guiton-dd and songy23 and removed request for a team May 6, 2026 21:20
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0ffd9f7abe

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread pkg/trace/semantics/mappings.json Outdated
"rpc.system": {
"canonical": "rpc.system",
"fallbacks": [
{"name": "rpc.system.name", "provider": "otel", "version": "1.39.0", "type": "string"},
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Add release note for rpc.system.name operation naming

When operation/resource name v2 is enabled, getOTelOperationNameV2 reads ConceptRPCSystem (see pkg/trace/transform/otelutil.go:444), so adding rpc.system.name here makes OTLP spans that only carry the new semconv key produce RPC operation names such as grpc.server.request instead of falling back to the instrumentation-scope name. That is a user-visible behavior change separate from the existing gRPC-status-code reno, so the PR should document it or avoid changing this fallback in the status-code refactor.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — verified and applied. Reverted the rpc.system.name addition to the rpc.system concept's fallback chain in b16f23e.

The gRPC conditional rows on rpc.grpc.status_code read rpc.system.name/rpc.system as raw attribute keys in their when clauses, so they don't depend on the rpc.system concept's fallback chain. Removing this entry has zero effect on gRPC status routing while restoring getOTelOperationNameV2's pre-PR behavior — so no release-note change is needed for this PR. The rpc.system.name promotion is a separate refactor; I've restored the future-work bullet in semantics.go to track it.

Codex bot flagged that adding rpc.system.name to the rpc.system concept's
fallback chain has a side effect via getOTelOperationNameV2
(transform/otelutil.go:444), which uses LookupString(ConceptRPCSystem)
to drive RPC operation naming. With the fallback added, OTLP spans
carrying only the new semconv key would have produced operation names
like "grpc.server.request" instead of falling back to the
instrumentation-scope name — a user-visible change unrelated to the
gRPC status-code refactor and not covered by the existing reno.

Revert the rpc.system concept's fallback chain to its main state. The
gRPC conditional rows in mappings.json read rpc.system.name and
rpc.system as raw attribute keys in their `when` clauses, not via the
concept fallback chain, so this revert has zero effect on the gRPC
status-code routing.

Restore the corresponding future-work bullet in semantics.go now that
the canonical/fallback promotion is again pending.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@edwardhu-datadog edwardhu-datadog force-pushed the feat/ehu/OTEL-2718-semantics-conditional-grpc-status branch from 0ffd9f7 to b16f23e Compare May 6, 2026 21:31
@edwardhu-datadog
Copy link
Copy Markdown
Contributor Author

@codex review

@edwardhu-datadog edwardhu-datadog added the qa/rc-required Only for a PR that requires validation on the Release Candidate label May 6, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4d956587a9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

{"name": "grpc.status.code", "provider": "datadog", "type": "int64"},
{"name": "rpc.response.status_code", "provider": "otel", "version": "1.39.0", "type": "string", "when": [{"attribute": "rpc.system.name", "eq": "grpc"}]},
{"name": "rpc.response.status_code", "provider": "otel", "version": "1.39.0", "type": "int64", "when": [{"attribute": "rpc.system.name", "eq": "grpc"}]},
{"name": "rpc.response.status_code", "provider": "otel", "version": "1.39.0", "type": "string", "when": [{"attribute": "rpc.system", "eq": "grpc"}, {"attribute": "rpc.system.name", "present": false}]},
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve span-level legacy gRPC fallback before resource keys

When an OTLP span has legacy rpc.system=grpc on the span but the resource includes any non-empty rpc.system.name, this present:false condition fails because OTelSpanAccessor.GetString("rpc.system.name") also checks resource attributes. The removed helper checked both rpc.system.name and rpc.system in span attributes before consulting resource attributes, so mixed-version payloads like span-level rpc.system=grpc plus resource-level rpc.system.name=jsonrpc used to emit rpc.grpc.status_code and now silently drop it.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

@edwardhu-datadog edwardhu-datadog May 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, but this only triggers when span and resource disagree about the RPC system — like span saying rpc.system=grpc while resource says rpc.system.name=jsonrpc. We don't have evidence any SDK actually does this; happy to revisit if we see it surface in real traffic.

Also, this conforms with how we handle HTTPStatusCodes today

@edwardhu-datadog edwardhu-datadog added the changelog/no-changelog No changelog entry needed label May 6, 2026
Comment thread pkg/trace/semantics/lookup.go Outdated
return false
}
if c.Eq != nil {
if !found || value != fmt.Sprint(c.Eq) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running fmt.Sprint on every condition match seems unnecessary? I would suggest changing Condition.Eq from any to *string, or precompute the stringified form at registry load time.

Copy link
Copy Markdown
Contributor Author

@edwardhu-datadog edwardhu-datadog May 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call — applied as *string in 0afceb1.

@aaylward-dd
Copy link
Copy Markdown

cc @mialcheng

Per @ichinaski's review feedback. Eq is only ever set to a string in
mappings.json (verified: 4 occurrences, all "grpc"), and conditionMatches
only does string comparison anyway. Typing it as *string lets us drop
the per-call fmt.Sprint allocation in conditionMatches and surfaces a
loud unmarshal error if anyone tries to write a non-string eq value in
JSON, which is the correct fail-loud behavior for the schema.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@edwardhu-datadog edwardhu-datadog force-pushed the feat/ehu/OTEL-2718-semantics-conditional-grpc-status branch from 4d95658 to 0afceb1 Compare May 7, 2026 15:19
Copy link
Copy Markdown
Contributor

@mialcheng mialcheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot merged commit 8cf3b53 into main May 8, 2026
239 checks passed
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot deleted the feat/ehu/OTEL-2718-semantics-conditional-grpc-status branch May 8, 2026 17:15
@github-actions github-actions Bot added this to the 7.80.0 milestone May 8, 2026
chouetz pushed a commit that referenced this pull request May 13, 2026
… mappings.json (#50397)

### What does this PR do?

Moves the `rpc.response.status_code` → gRPC special case from `transform.go` into the semantics registry as `when` clauses on `TagInfo`. The `transform` package now has zero gRPC-specific code — `LookupString(reg, spanAccessor, ConceptGRPCStatusCode)` does it all.

### Motivation

- [DSEM-807](https://datadoghq.atlassian.net/browse/DSEM-807) — Semantics Core team asked for conditional fallback support in the trace-agent registry.
- Follow-up to #50272 (merged) — addresses the [reviewer ask](#50272 (comment)) that the gRPC predicate belongs in the registry, not the transformer.
- Follows the [dd-source trace-semantics precedent](https://github.com/DataDog/dd-source/blob/main/domains/semantic-core/libs/go/trace-semantics/mappings.json#L803), which already expresses the same predicate.

### Describe how you validated your changes

```
go test ./pkg/trace/semantics/... -count=1     # 8 sub-tests in TestGRPCStatusCodeConditionalFallback
go test ./pkg/trace/transform/... -count=1     # no behavioral change
go test ./pkg/trace/otel/stats/... -count=1    # OTLP gRPC fixture still populates rpc.grpc.status_code
```

CI handles the broader integration suites.

### Additional Notes

**Known minor divergence from #50272:** the merged PR's `getOTelGRPCStatusCode` walked per-source (span-then-resource, both keys per source); our registry walks per-key (`OTelSpanAccessor` does span-then-resource per key independently). Both check span and resource — only the axis ordering differs.

I don't see this as an issue because this is how we handle HTTPStatusCodes today

```
Pre-PR (getOTelGRPCStatusCode):                           
  walk per-source first, picking newer key within a source:
    span: rpc.system.name? rpc.system?  → first hit wins
    res:  rpc.system.name? rpc.system?  → first hit wins (only if span empty)

Post-PR (registry, OTelSpanAccessor):
  walk per-key first, picking span over resource for each key:
    rpc.system.name: span? res?  → first hit wins
    rpc.system:      span? res?  → first hit wins (only if rpc.system.name empty)
```


| File | What changed |
|---|---|
| `semantics/semantics.go` | `Condition` type + `When []Condition` on `TagInfo` |
| `semantics/lookup.go` | `conditionMatches` / `conditionsMatch` helpers; 1-line guard in each `Lookup*` |
| `semantics/mappings.json` | 4 conditional `rpc.response.status_code` rows under `rpc.grpc.status_code`; `rpc.system.name` added to `rpc.system` chain |
| `semantics/lookup_test.go` | New `TestGRPCStatusCodeConditionalFallback` (8 sub-tests) |
| `transform/transform.go` | Delete `getOTelGRPCStatusCode`; call `LookupString` directly |

**Design:**

- Conditions read attributes by **exact key** — no transitive concept resolution. To gate on a renamed attribute, list one fallback row per condition attribute.
- The four `rpc.response.status_code` rows are: (string / int64) × (`rpc.system.name=grpc` / `rpc.system=grpc` AND `rpc.system.name` absent). The absence guard preserves the pre-refactor "new key wins" behavior.
- `Condition` schema: `{attribute, present, eq}`. Conditions in a `when` array are AND-ed.

**`version: "1.39.0"` provenance:** both new attributes are from OTel semconv [v1.39.0](https://github.com/open-telemetry/semantic-conventions/blob/v1.39.0/model/rpc/registry.yaml) (2026-01-12); neither is in [v1.24.0](https://github.com/open-telemetry/semantic-conventions/blob/v1.24.0/model/registry/rpc.yaml). Introducing commits: [`rpc.response.status_code`](open-telemetry/semantic-conventions@cd5b45c6) (cd5b45c6, 2025-12-04) and [`rpc.system.name`](open-telemetry/semantic-conventions@18db3e44) (18db3e44, 2025-12-09).

**Release note:** none added — user-visible behavior matches #50272's. The existing `apm-otlp-grpc-status-code-bf3137b34cb82136.yaml` reno still applies.

[DSEM-807]: https://datadoghq.atlassian.net/browse/DSEM-807?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

Co-authored-by: edward.hu <edward.hu@datadoghq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

changelog/no-changelog No changelog entry needed internal Identify a non-fork PR medium review PR review might take time qa/rc-required Only for a PR that requires validation on the Release Candidate team/agent-apm trace-agent team/opentelemetry OpenTelemetry team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants